Smart Paper Notes

Directed Acyclic Graph Factorization Machines for CTR Prediction via Knowledge Distillation

Zhen Tian , Ting Bai , Zibin Zhang , Zhiyuan Xu , Kangyi Lin , Ji-Rong Wen , Wayne Xin Zhao

Category: Machine Learning

2022-11-21

With the growth of high-dimensional sparse data in web-scale recommender systems, the computational cost of learning high-order feature interactions in the CTR prediction task has increased substantially, which limits the use of high-order interaction models in real industrial applications. Some recent knowledge distillation based methods transfer knowledge from complex teacher models to shallow student models to accelerate online model inference. However, they suffer from degraded model accuracy in the knowledge distillation process, and it is challenging to balance the efficiency and effectiveness of the shallow student models. To address this problem, we propose a Directed Acyclic Graph Factorization Machine (KD-DAGFM) to learn high-order feature interactions from existing complex interaction models for CTR prediction via Knowledge Distillation. The proposed lightweight student model DAGFM can learn arbitrary explicit feature interactions from teacher networks, achieving approximately lossless performance, as proved via a dynamic programming algorithm. In addition, an improved general model, KD-DAGFM+, is shown to be effective in distilling both explicit and implicit feature interactions from any complex teacher model. Extensive experiments are conducted on four real-world datasets, including a large-scale industrial dataset from the WeChat platform with billions of feature dimensions. KD-DAGFM achieves the best performance with less than 21.5% of the FLOPs of the state-of-the-art method in both online and offline experiments, showing the superiority of DAGFM in dealing with industrial-scale data in the CTR prediction task. Our implementation code is available at: https://github.com/RUCAIBox/DAGFM.
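
As a rough illustration of the teacher-to-student transfer that the abstract describes, the sketch below implements a generic soft-target distillation loss (temperature-scaled KL divergence between teacher and student outputs). This is an assumption made for illustration: it is not taken from the KD-DAGFM paper or its repository, and the function names `softmax` and `distillation_loss` and the `temperature` parameter are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Generic soft-target loss: T^2 * KL(teacher || student).

    Hypothetical illustration only, not the objective used by KD-DAGFM.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    kl = sum(
        p * math.log(p / q)
        for p, q in zip(teacher_probs, student_probs)
        if p > 0.0
    )
    return temperature ** 2 * kl


# The loss is zero when the student reproduces the teacher's logits,
# and positive when the two output distributions differ.
matched = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mismatched = distillation_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0])
```

A higher temperature flattens both distributions, so the student is also pushed to match the teacher's relative preferences among low-probability outputs.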

translated by Google Translate

PetLock: A Genderless and Standard Interface for the Future On-orbit Construction

Yuntao Li , Zichun Xu , Xiaohang Yang , Zhiyuan Zhao , Jingdong Zhao , Hong Liu

Category: Robotics

2022-09-09

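Modular design is the foundation of on-orbit construction technology for future large space facilities. A standard interface is a key technology for the modular design of future space robotic systems and space facilities. This paper presents the design and testing of PetLock, a standard, genderless interface that can transfer mechanical loads, power, and data between future modular space robotic manipulators and spacecraft. PetLock adopts a completely genderless design, including the connection face, the locking mechanism, and the data and power interfaces. The connection surface provides a large tolerance to translational and rotational misalignment, owing to its 120-degree symmetric, 3D-shaped design. The locking mechanism features a three-locking-pin retraction structure, which is simple and reliable. With the advantages of high locking force, high tolerance, high reliability, and low cost, PetLock has great application potential in future on-orbit construction missions.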

A Combined Inverse Kinematics Algorithm Using FABRIK with Optimization

Zichun Xu , Yuntao Li , Xiaohang Yang , Zhiyuan Zhao , Jingdong Zhao , Hong Liu

Category: Robotics

2022-09-06

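Forward and backward reaching inverse kinematics (FABRIK) is a heuristic inverse kinematics solver that is gradually being applied to manipulators, with the advantages of fast convergence and more realistic generated configurations. However, under a high error constraint, FABRIK exhibits unstable convergence behavior, which is unsatisfactory for the real-time motion planning of manipulators. In this paper, a novel inverse kinematics algorithm that combines FABRIK with the sequential quadratic programming (SQP) algorithm is presented, in which the joint angles deduced by FABRIK are taken as the initial seed of the SQP algorithm to avoid getting stuck in local minima. The combined algorithm is evaluated experimentally: under a high error constraint, it achieves higher success rates and faster solution times than FABRIK. Furthermore, the combined algorithm can generate continuous trajectories for the UR5 and KUKA LBR IIWA 14 R820 manipulators in path tracking, with no pose error and with the end-effector position error within the permitted range.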
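
To make the forward and backward reaching idea concrete, here is a minimal sketch of the basic FABRIK iteration for an unconstrained planar (2D) chain. It is an illustration under simplifying assumptions, not the combined algorithm from the paper: it works on joint positions rather than joint angles, ignores joint limits, and omits the SQP refinement stage entirely. The names `fabrik` and `_place` and the parameters `tol` and `max_iter` are hypothetical.

```python
import math


def _place(anchor, point, length):
    """Return the point at distance `length` from `anchor`, on the ray toward `point`."""
    d = math.dist(anchor, point)
    if d == 0.0:
        # Degenerate case: the direction is undefined, so pick the +x axis arbitrarily.
        return (anchor[0] + length, anchor[1])
    scale = length / d
    return (anchor[0] + scale * (point[0] - anchor[0]),
            anchor[1] + scale * (point[1] - anchor[1]))


def fabrik(joints, target, tol=1e-6, max_iter=100):
    """Basic FABRIK for a planar chain; `joints` lists (x, y) positions, base first."""
    pts = [tuple(p) for p in joints]
    lengths = [math.dist(pts[i], pts[i + 1]) for i in range(len(pts) - 1)]
    base = pts[0]
    target = tuple(target)

    # Unreachable target: stretch the chain in a straight line toward it.
    if math.dist(base, target) > sum(lengths):
        for i in range(len(lengths)):
            pts[i + 1] = _place(pts[i], target, lengths[i])
        return pts

    for _ in range(max_iter):
        if math.dist(pts[-1], target) <= tol:
            break
        # Backward reaching: pin the end effector to the target, walk to the base.
        pts[-1] = target
        for i in range(len(pts) - 2, -1, -1):
            pts[i] = _place(pts[i + 1], pts[i], lengths[i])
        # Forward reaching: pin the base back in place, walk to the end effector.
        pts[0] = base
        for i in range(len(pts) - 1):
            pts[i + 1] = _place(pts[i], pts[i + 1], lengths[i])
    return pts


# Three unit-length links starting along the x axis, reaching for (1.5, 1.5).
solved = fabrik([(0, 0), (1, 0), (2, 0), (3, 0)], (1.5, 1.5))
```

In a combined scheme like the one the abstract describes, the joint configuration recovered from such positions would serve only as the starting point for a constrained optimizer.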

CVFNet: Real-time 3D Object Detection by Learning Cross View Features

Jiaqi Gu , Zhiyu Xiang , Pan Zhao , Tingming Bai , Lingxuan Wang , Xijun Zhao , Zhiyuan Zhang

Category: Computer Vision

2022-03-13

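In recent years, 3D object detection from LiDAR point clouds has made great progress thanks to the development of deep learning techniques. Although voxel-based and point-based methods are popular in 3D object detection, they usually involve time-consuming operations such as 3D convolutions on voxels or ball queries among points, which makes the resulting networks unsuitable for time-critical applications. On the other hand, 2D view-based methods offer high computational efficiency but usually achieve lower performance than voxel-based or point-based methods. In this work, we present a real-time, view-based, single-stage 3D object detector, namely CVFNet, for this task. To strengthen cross-view feature learning under demanding efficiency requirements, our framework extracts features from different views and fuses them in an efficient, progressive manner. We first propose a novel Point-Range feature fusion module that deeply integrates point and range view features over multiple stages. Then, when the resulting deep point-view features are transformed into the bird's eye view, a special Slice Pillar is designed to preserve the 3D geometry well. To better balance the ratio of samples, a sparse pillar detection head is presented that focuses detection on the non-empty grids. We conduct experiments on the popular KITTI and NuScenes benchmarks and achieve state-of-the-art performance in terms of both accuracy and speed.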

CUGE: A Chinese Language Understanding and Generation Evaluation Benchmark

Yuan Yao , Qingxiu Dong , Jian Guan , Boxi Cao , Zhengyan Zhang , Chaojun Xiao , Xiaozhi Wang , Fanchao Qi , Junwei Bao , Jinran Nie

Category: Natural Language Processing

2021-12-27

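Realizing general-purpose language intelligence has been a long-standing goal of natural language processing, in which standard evaluation benchmarks play a fundamental and guiding role. We argue that, for evaluating general-purpose language intelligence, the benchmark itself needs to be comprehensive and systematic. To this end, we propose CUGE, a Chinese Language Understanding and Generation Evaluation benchmark with the following features: (1) a hierarchical benchmark framework, in which datasets are selected and organized in a principled way according to a language capability-task-dataset hierarchy; (2) a multi-level scoring strategy, in which model performance is reported at different levels based on the hierarchical framework. To facilitate the use of CUGE, we provide a public leaderboard that can be customized to support flexible model judging criteria. Evaluation results on representative pre-trained language models indicate ample room for improvement toward general-purpose language intelligence. CUGE is publicly available at cuge.baai.ac.cn.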

Certifiable Deep Importance Sampling for Rare-Event Simulation of Black-Box Systems

Mansur Arief , Yuanlu Bai , Wenhao Ding , Shengyi He , Zhiyuan Huang , Henry Lam , Ding Zhao

Category: (Statistical) Machine Learning

2021-11-03

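Rare-event simulation techniques, such as importance sampling (IS), constitute powerful tools for speeding up the challenging estimation of rare catastrophic events. These techniques often leverage knowledge and analysis of the underlying system structure to provide desirable efficiency guarantees. However, black-box problems, especially those arising from recent safety-critical applications of AI-driven physical systems, can fundamentally undermine these efficiency guarantees and lead to dangerous under-estimation that goes undetected by diagnostics. We propose a framework called Deep Probabilistic Accelerated Evaluation (Deep-PrAE) to design statistically guaranteed IS, by converting black-box samplers that are versatile but may lack guarantees into samplers with what we call a relaxed efficiency certificate, which allows accurate estimation of bounds on the rare-event probability. We present the theory of Deep-PrAE, which combines the dominating point concept with rare-event set learning via deep neural network classifiers, and demonstrate its effectiveness in numerical examples, including the safety testing of intelligent driving algorithms.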
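
For readers unfamiliar with importance sampling, the following sketch estimates a small Gaussian tail probability by sampling from a proposal shifted to the rare-event boundary and reweighting by the likelihood ratio. It is a textbook mean-shift example chosen for illustration, not the Deep-PrAE method: the system here is a known standard normal rather than a black box, and no neural network classifier or efficiency certificate is involved. The function names `rare_event_probability` and `exact_tail` are hypothetical.

```python
import math
import random


def rare_event_probability(threshold, n_samples, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1) by mean-shift importance sampling."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Sample from the proposal N(threshold, 1), centered on the rare-event boundary.
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            # Likelihood ratio N(0, 1) / N(threshold, 1) evaluated at x.
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n_samples


def exact_tail(threshold):
    """Closed-form P(X > threshold) for a standard normal, used as a reference."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


# Compare the estimate with the closed-form tail probability at threshold 4.
estimate = rare_event_probability(4.0, 100_000)
reference = exact_tail(4.0)
```

Naive Monte Carlo with the same sample budget would observe only a handful of exceedances at this threshold, which is why a well-chosen proposal matters. The abstract's concern is that, for a black-box system, a poorly chosen proposal can silently miss parts of the rare-event set.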

OpenPrompt: An Open-source Framework for Prompt-learning

Ning Ding , Shengding Hu , Weilin Zhao , Yulin Chen , Zhiyuan Liu , Hai-Tao Zheng , Maosong Sun

Category: Natural Language Processing | Artificial Intelligence | Machine Learning

2021-11-03

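Prompt-learning has become a new paradigm in modern natural language processing. It directly adapts pre-trained language models (PLMs) to cloze-style prediction, autoregressive modeling, or sequence-to-sequence generation, yielding promising performance on a variety of tasks. However, no standard implementation framework for prompt-learning has been proposed yet, and most existing prompt-learning codebases, which are often unregulated, provide only limited implementations for specific scenarios. Because many details need to be considered in prompt-learning, such as the templating strategy, initialization strategy, and verbalizing strategy, practitioners face obstacles in quickly adapting the desired prompt-learning methods to their applications. In this paper, we present OpenPrompt, a unified, easy-to-use toolkit for conducting prompt-learning over PLMs. OpenPrompt is a research-friendly framework equipped with efficiency, modularity, and extensibility, and its combinability allows users to freely combine different PLMs, task formats, and prompting modules in a unified paradigm. Users can readily deploy prompt-learning frameworks and evaluate their generalization on different NLP tasks without restrictions. OpenPrompt is publicly released at https://github.com/thunlp/openprompt.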

FedICT: Federated Multi-task Distillation for Multi-access Edge Computing

Zhiyuan Wu , Sheng Sun , Yuwei Wang , Min Liu , Xuefeng Jiang , Bo Gao

Category: Machine Learning

2023-01-01

The growing interest in intelligent services and privacy protection for mobile devices has given rise to the widespread application of federated learning in Multi-access Edge Computing (MEC). Diverse user behaviors call for personalized services with heterogeneous Machine Learning (ML) models on different devices. Federated Multi-task Learning (FMTL) has been proposed to train related but personalized ML models for different devices, but previous works suffer from excessive communication overhead during training and neglect the model heterogeneity among devices in MEC. Introducing knowledge distillation into FMTL can simultaneously enable efficient communication and model heterogeneity among clients, but existing methods rely on a public dataset, which is impractical in reality. To tackle this dilemma, Federated MultI-task Distillation for Multi-access Edge CompuTing (FedICT) is proposed. FedICT keeps local and global knowledge apart during the bi-directional distillation processes between clients and the server, aiming to enable multi-task clients while alleviating the client drift that arises from divergent optimization directions of client-side local models. Specifically, FedICT includes Federated Prior Knowledge Distillation (FPKD) and Local Knowledge Adjustment (LKA). FPKD is proposed to reinforce the clients' fitting of local data by introducing prior knowledge of local data distributions. Moreover, LKA is proposed to correct the distillation loss of the server, making the transferred local knowledge better match the generalized representation. Experiments on three datasets show that FedICT significantly outperforms all compared benchmarks under various data heterogeneity and model architecture settings, achieving improved accuracy with less than 1.2% of the training communication overhead of FedAvg and no more than 75% of the training communication rounds of FedGKT.

NEEDED: Introducing Hierarchical Transformer to Eye Diseases Diagnosis

Xu Ye , Meng Xiao , Zhiyuan Ning , Weiwei Dai , Wenjuan Cui , Yi Du , Yuanchun Zhou

Category: Natural Language Processing

2022-12-27

With the development of natural language processing (NLP) techniques, automatic diagnosis of eye diseases using ophthalmology electronic medical records (OEMR) has become possible. It aims to evaluate the condition of each of a patient's eyes separately, and we formulate it as a particular multi-label classification task in this paper. Although there are a few related studies on other diseases, automatic diagnosis of eye diseases exhibits unique characteristics. First, descriptions of both eyes are mixed together in OEMR documents, with both free text and templated asymptomatic descriptions, resulting in sparse and cluttered information. Second, OEMR documents contain multiple descriptive parts and are long. Third, it is critical for the disease diagnosis model to be explainable. To overcome these challenges, we present an effective automatic eye disease diagnosis framework, NEEDED. In this framework, a preprocessing module is integrated to improve the density and quality of information. Then, we design a hierarchical transformer structure for learning contextualized representations of each sentence in the OEMR document. For the diagnosis part, we propose an attention-based predictor that enables traceable diagnosis by obtaining disease-specific information. Experiments on a real dataset and comparisons with several baseline models show the advantage and explainability of our framework.

KNIFE: Knowledge Distillation with Free-Text Rationales

Aaron Chan , Zhiyuan Zeng , Wyatt Lake , Brihi Joshi , Hanjie Chen , Xiang Ren

Category: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-19

Free-text rationales (FTRs) follow how humans communicate by explaining reasoning processes via natural language. A number of recent works have studied how to improve language model (LM) generalization by using FTRs to teach LMs the correct reasoning processes behind correct task outputs. These prior works aim to learn from FTRs by appending them to the LM input or target output, but this may introduce an input distribution shift or conflict with the task objective, respectively. We propose KNIFE, which distills FTR knowledge from an FTR-augmented teacher LM (takes both task input and FTR) to a student LM (takes only task input), which is used for inference. Crucially, the teacher LM's forward computation has a bottleneck stage in which all of its FTR states are masked out, which pushes knowledge from the FTR states into the task input/output states. Then, FTR knowledge is distilled to the student LM by training its task input/output states to align with the teacher LM's. On two question answering datasets, we show that KNIFE significantly outperforms existing FTR learning methods, in both fully-supervised and low-resource settings.
